Smart Paper Notes

Using scientific machine learning for experimental bifurcation analysis of dynamic systems

Sandor Beregi , David A. W. Barton , Djamel Rezgui , Simon A. Neild

Category: Machine Learning

2021-10-22

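Augmenting mechanistic ordinary differential equation (ODE) models with machine-learnable structures is a novel approach to creating highly accurate, low-dimensional models of engineering systems that incorporate both expert knowledge and reality through measurement data. Our exploratory study focuses on training universal differential equation (UDE) models for physical nonlinear dynamical systems with limit cycles: an aerofoil undergoing flutter oscillations and an electrodynamic nonlinear oscillator. We consider examples where the training data is generated by numerical simulations, and we also apply the proposed modelling concept to physical experiments, allowing us to investigate problems across a wide range of complexity. To collect the training data, the method of control-based continuation is used, as it captures not just the stable but also the unstable limit cycles of the observed system. This feature makes it possible to extract more information about the observed system than the standard, open-loop approach would allow. We use both neural networks and Gaussian processes as universal approximators alongside the mechanistic models to give a critical assessment of the accuracy and robustness of the UDE modelling approach. We also highlight potential issues one may run into during the training procedure, indicating the limits of the current modelling framework.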

Statistical Design and Analysis for Robust Machine Learning: A Case Study from COVID-19

Davide Pigoli , Kieran Baker , Jobie Budd , Lorraine Butler , Harry Coppock , Sabrina Egglestone , Steven G. Gilmour , Chris Holmes , David Hurley , Radka Jersakova

Category: Machine Learning

2022-12-15

Since early in the coronavirus disease 2019 (COVID-19) pandemic, there has been interest in using artificial intelligence methods to predict COVID-19 infection status based on vocal audio signals, for example cough recordings. However, existing studies have limitations in terms of data collection and of the assessment of the performances of the proposed predictive models. This paper rigorously assesses state-of-the-art machine learning techniques used to predict COVID-19 infection status based on vocal audio signals, using a dataset collected by the UK Health Security Agency. This dataset includes acoustic recordings and extensive study participant meta-data. We provide guidelines on testing the performance of methods to classify COVID-19 infection status based on acoustic features and we discuss how these can be extended more generally to the development and assessment of predictive methods based on public health datasets.

Interpretable ML for Imbalanced Data

Damien A. Dablain , Colin Bellinger , Bartosz Krawczyk , David W. Aha , Nitesh V. Chawla

Category: Machine Learning

2022-12-15

Deep learning models are being increasingly applied to imbalanced data in high stakes fields such as medicine, autonomous driving, and intelligence analysis. Imbalanced data compounds the black-box nature of deep networks because the relationships between classes may be highly skewed and unclear. This can reduce trust by model users and hamper the progress of developers of imbalanced learning algorithms. Existing methods that investigate imbalanced data complexity are geared toward binary classification, shallow learning models and low dimensional data. In addition, current eXplainable Artificial Intelligence (XAI) techniques mainly focus on converting opaque deep learning models into simpler models (e.g., decision trees) or mapping predictions for specific instances to inputs, instead of examining global data properties and complexities. Therefore, there is a need for a framework that is tailored to modern deep networks, that incorporates large, high dimensional, multi-class datasets, and uncovers data complexities commonly found in imbalanced data (e.g., class overlap, sub-concepts, and outlier instances). We propose a set of techniques that can be used by both deep learning model users to identify, visualize and understand class prototypes, sub-concepts and outlier instances; and by imbalanced learning algorithm developers to detect features and class exemplars that are key to model performance. Our framework also identifies instances that reside on the border of class decision boundaries, which can carry highly discriminative information. Unlike many existing XAI techniques which map model decisions to gray-scale pixel locations, we use saliency through back-propagation to identify and aggregate image color bands across entire classes. Our framework is publicly available at https://github.com/dd1github/XAI_for_Imbalanced_Learning

A large-scale and PCR-referenced vocal audio dataset for COVID-19

Jobie Budd , Kieran Baker , Emma Karoune , Harry Coppock , Selina Patel , Ana Tendero Cañadas , Alexander Titcomb , Richard Payne , David Hurley , Sabrina Egglestone

Category: Machine Learning

2022-12-15

The UK COVID-19 Vocal Audio Dataset is designed for the training and evaluation of machine learning models that classify SARS-CoV-2 infection status or associated respiratory symptoms using vocal audio. The UK Health Security Agency recruited voluntary participants through the national Test and Trace programme and the REACT-1 survey in England from March 2021 to March 2022, during dominant transmission of the Alpha and Delta SARS-CoV-2 variants and some Omicron variant sublineages. Audio recordings of volitional coughs, exhalations, and speech were collected in the 'Speak up to help beat coronavirus' digital survey alongside demographic, self-reported symptom and respiratory condition data, and linked to SARS-CoV-2 test results. The UK COVID-19 Vocal Audio Dataset represents the largest collection of SARS-CoV-2 PCR-referenced audio recordings to date. PCR results were linked to 70,794 of 72,999 participants and 24,155 of 25,776 positive cases. Respiratory symptoms were reported by 45.62% of participants. This dataset has additional potential uses for bioacoustics research, with 11.30% participants reporting asthma, and 27.20% with linked influenza PCR test results.
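
The linkage counts quoted in the abstract above can be turned into rates with a few lines of arithmetic. The sketch below is only a reader's check on those counts; the function name `linkage_rate` and the variable names are illustrative assumptions, not part of the dataset or its tooling.

```python
# Counts quoted in the abstract; all names below are illustrative only.
participants_total = 72_999
participants_linked = 70_794
positives_total = 25_776
positives_linked = 24_155


def linkage_rate(linked: int, total: int) -> float:
    """Return the share of records with a linked PCR result, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * linked / total


participant_rate = linkage_rate(participants_linked, participants_total)
positive_rate = linkage_rate(positives_linked, positives_total)

print(f"participants with linked PCR result: {participant_rate:.2f}%")
print(f"positive cases with linked PCR result: {positive_rate:.2f}%")
```

Both rates come out well above 90%, which is consistent with the abstract's description of the dataset as PCR-referenced.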

Design of a Parallel Elastic Actuator with a Continuously-Adjustable Equilibrium Position

Evangelos Chatziandreou , Chase W. Mathews , David J. Braun

Category: Robotics

2022-12-15

In this paper, we present an adjustable-equilibrium parallel elastic actuator (AE-PEA). The actuator consists of a motor, an equilibrium adjusting mechanism, and a spring arranged into a cylindrical geometry, similar to a motor-gearbox assembly. The novel component of the actuator is the equilibrium adjusting mechanism which (i) does not require external energy to maintain the equilibrium position of the actuator even if the spring is deformed and (ii) enables equilibrium position control with low energy cost by rotating the spring while keeping it undeformed. Adjustable equilibrium parallel elastic actuators resolve the main limitation of parallel elastic actuators (PEAs) by enabling energy-efficient operation at different equilibrium positions, instead of being limited to energy-efficient operation at a single equilibrium position. We foresee the use of AE-PEAs in industrial robots, mobile robots, exoskeletons, and prostheses, where efficient oscillatory motion and gravity compensation at different positions are required.

PulseImpute: A Novel Benchmark Task for Pulsative Physiological Signal Imputation

Maxwell A. Xu , Alexander Moreno , Supriya Nagesh , V. Burak Aydemir , David W. Wetter , Santosh Kumar , James M. Rehg

Category: Machine Learning | Artificial Intelligence

2022-12-14

The promise of Mobile Health (mHealth) is the ability to use wearable sensors to monitor participant physiology at high frequencies during daily life to enable temporally-precise health interventions. However, a major challenge is frequent missing data. Despite a rich imputation literature, existing techniques are ineffective for the pulsative signals which comprise many mHealth applications, and a lack of available datasets has stymied progress. We address this gap with PulseImpute, the first large-scale pulsative signal imputation challenge which includes realistic mHealth missingness models, an extensive set of baselines, and clinically-relevant downstream tasks. Our baseline models include a novel transformer-based architecture designed to exploit the structure of pulsative signals. We hope that PulseImpute will enable the ML community to tackle this significant and challenging task.

E2E Segmentation in a Two-Pass Cascaded Encoder ASR Model

W. Ronny Huang , Shuo-Yiin Chang , Tara N. Sainath , Yanzhang He , David Rybach , Robert David , Rohit Prabhavalkar , Cyril Allauzen , Cal Peyser , Trevor D. Strohman

Category: Natural Language Processing

2022-11-28

We explore unifying a neural segmenter with two-pass cascaded encoder ASR into a single model. A key challenge is allowing the segmenter (which runs in real-time, synchronously with the decoder) to finalize the 2nd pass (which runs 900 ms behind real-time) without introducing user-perceived latency or deletion errors during inference. We propose a design where the neural segmenter is integrated with the causal 1st pass decoder to emit an end-of-segment (EOS) signal in real-time. The EOS signal is then used to finalize the non-causal 2nd pass. We experiment with different ways to finalize the 2nd pass, and find that a novel dummy frame injection strategy allows for simultaneous high quality 2nd pass results and low finalization latency. On a real-world long-form captioning task (YouTube), we achieve 2.4% relative WER and 140 ms EOS latency gains over a baseline VAD-based segmenter with the same cascaded encoder.
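
For readers unfamiliar with the metric, a relative WER gain is the reduction in word error rate divided by the baseline word error rate, whereas the 140 ms latency gain is an absolute difference. The sketch below illustrates the relative calculation only. The two WER values are hypothetical placeholders chosen to produce a 2.4% relative gain; the abstract does not report absolute WERs, and `relative_gain` is an illustrative name.

```python
def relative_gain(baseline: float, improved: float) -> float:
    """Relative reduction of `improved` versus `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - improved) / baseline


# Hypothetical absolute WERs (in percent); NOT taken from the paper.
hypothetical_baseline_wer = 12.5
hypothetical_new_wer = 12.2

gain = relative_gain(hypothetical_baseline_wer, hypothetical_new_wer)
print(f"relative WER gain: {gain:.1f}%")
```

With these placeholder values, a drop of 0.3 absolute WER points corresponds to a 2.4% relative gain, which shows why relative figures should be read alongside the baseline.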

Development of a Modular and Submersible Soft Robotic Arm and Corresponding Learned Kinematics Models

W. David Null , YZ

Category: Robotics | Artificial Intelligence | Machine Learning

2022-09-19

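Most soft-bodied organisms found in nature live in underwater environments, so studying the motion and control of underwater soft robots is worthwhile. However, readily available underwater soft robotic systems do not exist, because such systems are difficult to design, fabricate, and waterproof. Furthermore, submersible robots usually lack configurable components because they require sealed electronics packages. This work presents the development of a submersible soft robotic arm driven by hydraulic actuators. The arm consists mostly of 3D-printable parts that can be assembled in a short time, and its modular design allows multiple shape configurations and easy swapping of soft actuators. As a first step toward exploring machine learning control algorithms on this system, two deep neural network models were developed, trained, and evaluated to estimate the robot's forward and inverse kinematics. The techniques used to control this underwater soft robotic arm can help advance the understanding of how to control soft robotic systems in general.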

Where is VALDO? VAscular Lesions Detection and segmentatiOn challenge at MICCAI 2021

Carole H. Sudre , Kimberlin Van Wijnen , Florian Dubost , Hieab Adams , David Atkinson , Frederik Barkhof , Mahlet A. Birhanu , Esther E. Bron , Robin Camarasa , Nish Chaturvedi

Category: Computer Vision | Artificial Intelligence

2022-08-15

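Imaging markers of cerebral small vessel disease provide valuable information on brain health, but their manual assessment is time-consuming and hampered by substantial intra- and inter-rater variability. Automated rating may benefit biomedical research as well as clinical assessment, but the diagnostic reliability of existing algorithms is unknown. Here, we present the results of the VAscular Lesions Detection and segmentatiOn (Where is VALDO?) challenge, which was run as a satellite event at the international conference on Medical Image Computing and Computer Aided Intervention (MICCAI) 2021. This challenge aimed to promote the development of methods for automated detection and segmentation of small and sparse imaging markers of cerebral small vessel disease, namely enlarged perivascular spaces (EPVS) (Task 1), cerebral microbleeds (Task 2), and lacunes of presumed vascular origin (Task 3), while leveraging weak and noisy labels. Overall, 12 teams participated in the challenge, proposing solutions for one or more tasks (4 for Task 1 - EPVS, 9 for Task 2 - Microbleeds, and 6 for Task 3 - Lacunes). Multi-cohort data was used in both training and evaluation. Results showed a large variability in performance both across teams and across tasks, with promising results notably for Task 1 - EPVS and Task 2 - Microbleeds, and not yet practically useful results for Task 3 - Lacunes. The challenge also highlighted performance inconsistency across cases that may deter use at an individual level, while still proving useful at a population level.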

Rapid Exploration of a 32.5M Compound Chemical Space with Active Learning to Discover Density Functional Approximation Insensitive and Synthetically Accessible Transitional Metal Chromophores

Chenru Duan , Aditya Nandy , Gianmarco Terrones , David W. Kastner , Heather J. Kulik

Category: Machine Learning

2022-08-10

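Two outstanding challenges for machine learning (ML) accelerated chemical discovery are the synthesizability of candidate molecules or materials and the fidelity of the data used in ML model training. To address the first challenge, we construct a hypothetical design space of 32.5M transition metal complexes (TMCs), in which all of the constituent fragments (i.e., metals and ligands) and ligand symmetries are synthetically accessible. To address the second challenge, we search for consensus in predictions among 23 density functional approximations across multiple rungs of Jacob's ladder. To accelerate the screening of these 32.5M TMCs, we use efficient global optimization to sample candidate low-spin chromophores that simultaneously have low absorption energies and low static correlation. Despite the scarcity (i.e., < 0.01%) of potential chromophores in this large chemical space, we identify transition metal chromophores with high likelihood (i.e., > 10%) as the ML models improve during active learning. This represents a 1,000-fold acceleration in discovery, corresponding to discoveries in days instead of years. Analyses of candidate chromophores reveal a preference for Co(III) and large, strong-field ligands with more bond saturation. We compute the absorption spectra of promising chromophores on the Pareto front by time-dependent density functional theory calculations and verify that two thirds of them have the desired excited state properties. Although these complexes have never been experimentally explored, their constituent ligands have demonstrated interesting optical properties in the literature, exemplifying the effectiveness of our construction of a realistic TMC design space and of our active learning approach.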
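
The 1,000-fold acceleration quoted in the abstract follows directly from the two hit rates it gives: under 0.01% in the full chemical space versus over 10% among actively sampled candidates. The sketch below reproduces that ratio at the two boundary values. It is a reader's arithmetic check under that assumption, not the authors' calculation, and the variable names are illustrative.

```python
from fractions import Fraction

# Boundary values quoted in the abstract, kept as exact fractions
# so the ratio is not affected by floating-point rounding.
baseline_hit_rate = Fraction(1, 10_000)  # 0.01 %, quoted as an upper bound
guided_hit_rate = Fraction(10, 100)      # 10 %, quoted as a lower bound

speedup = guided_hit_rate / baseline_hit_rate
print(f"hit-rate ratio at the quoted bounds: {speedup}x")
```

Because the abstract gives the baseline as an upper bound and the guided rate as a lower bound, the ratio at these boundary values is a conservative estimate of the improvement.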
